Exploratory Geovisualization with PySAL

Introduction

When PySAL was originally planned, the intention was to focus on the computational aspects of exploratory spatial data analysis and spatial econometric methods, while relying on existing GIS packages and visualization libraries to display the results. Indeed, we have partnered with Esri and QGIS towards this end.

However, over time we have received many requests to support basic geovisualization within PySAL so that the step of interoperating with an external package can be avoided, thereby increasing the efficiency of the spatial analytical workflow.

In this notebook, we demonstrate several approaches to geovisualization within a self-contained exploratory workflow. The idea here is to support quick generation of different views of your data to complement the statistical and econometric work in PySAL. Once your work has progressed to the publication stage, we point you to resources that can be used for publication-quality output.

PySAL Viz Module

Contributors:

This document describes the main structure, components and usage of the mapping module in PySAL. The module is organized around three main layers:

  • A lower-level layer that reads polygon, line and point shapefiles and returns a Matplotlib collection.
  • A medium-level layer that performs some usual transformations on a Matplotlib object (e.g. color code polygons according to a vector of values).
  • A higher-level layer intended for end-users, covering particularly useful cases with pre-defined style preferences (e.g. create a choropleth).

In [1]:
import numpy as np
import pysal as ps
import random as rdm
from pysal.contrib.viz import mapping as maps
%matplotlib inline
from pylab import *

Lower-level component

This includes basic functionality to read spatial data from a file (currently only shapefiles supported) and produce rudimentary Matplotlib objects. The main methods are:

  • map_poly_shp: to read in polygon shapefiles
  • map_line_shp: to read in line shapefiles
  • map_point_shp: to read in point shapefiles

These methods all support an option to subset the observations to be plotted (very useful when missing values are present). They can also be overlaid and combined by using the setup_ax function. The resulting object is very basic but also very flexible, which should be good news for anyone used to matplotlib, because pretty much any property or attribute can be modified.
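The subsetting option can be pictured as a boolean mask with one entry per observation, like the list passed as `which` in the example that follows. The sketch below is plain Python and is not the PySAL implementation; the `subset` helper and the sample records are hypothetical and only show the idea of masking out observations with missing values.

```python
# Hypothetical stand-ins for polygon records; a real shapefile reader
# would supply geometry objects here.
polygons = ['poly_a', 'poly_b', 'poly_c', 'poly_d']
values = [12.5, None, 7.1, None]

# Boolean mask: True where the observation has a value.
mask = [v is not None for v in values]


def subset(items, which):
    """Keep only the items whose mask entry is True (hypothetical helper)."""
    return [item for item, keep in zip(items, which) if keep]


selected = subset(polygons, mask)
print(selected)
```

Only the selected records would then be turned into drawable objects, so observations without data never reach the plot.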

Example


In [2]:
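# Open the Columbus polygon shapefile bundled with PySAL
shp_link = ps.examples.get_path('columbus.shp')
shp = ps.open(shp_link)
# Random boolean mask choosing which polygons to highlight
some = [bool(rdm.getrandbits(1)) for i in ps.open(shp_link)]

fig = figure()

# Base layer: all polygons, drawn as light grey outlines
base = maps.map_poly_shp(shp)
base.set_facecolor('none')
base.set_linewidth(0.75)
base.set_edgecolor('0.8')
# Highlight layer: only the polygons selected by the mask
some = maps.map_poly_shp(shp, which=some)
some.set_alpha(0.5)
some.set_linewidth(0.)
# Point layer: polygon centroids in red
cents = np.array([poly.centroid for poly in ps.open(shp_link)])
pts = scatter(cents[:, 0], cents[:, 1])
pts.set_color('red')

# Combine the layers, each with the shapefile bounding box
ax = maps.setup_ax([base, some, pts], [shp.bbox, shp.bbox, shp.bbox])
fig.add_axes(ax)
show()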


Medium-level component

This layer comprises functions that perform usual transformations on matplotlib objects, such as color coding objects (points, polygons, etc.) according to a series of values. This includes the following methods:

  • base_choropleth_classless
  • base_choropleth_unique
  • base_choropleth_classif
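As a rough illustration of what color coding by a series of values involves, the sketch below shows two common strategies in plain Python: a classless mapping that rescales values to the [0, 1] range expected by a continuous colormap, and a unique-values mapping that assigns one integer code per distinct value. The function names `classless_shades` and `unique_codes` are hypothetical, and this is not the code inside the PySAL functions listed above.

```python
def classless_shades(values):
    """Rescale values linearly to [0, 1] for use with a continuous colormap."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values identical: map everything to the low end of the ramp.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def unique_codes(values):
    """Assign each distinct value an integer code, in sorted order."""
    lookup = {v: i for i, v in enumerate(sorted(set(values)))}
    return [lookup[v] for v in values]


shades = classless_shades([10.0, 20.0, 40.0])
codes = unique_codes(['b', 'a', 'b', 'c'])
print(shades, codes)
```

The shades could be fed to a continuous colormap and the codes used as indices into a categorical palette.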

Example


In [21]:
# Line shapefile of a street network and one attribute column from its DBF
net_link = ps.examples.get_path('eberly_net.shp')
net = ps.open(net_link)
values = np.array(ps.open(net_link.replace('.shp', '.dbf')).by_col('TNODE'))

# Point shapefile of locations on the network
pts_link = ps.examples.get_path('eberly_net_pts_onnetwork.shp')
pts = ps.open(pts_link)

fig = figure()

# Color the network lines by unique value
netm = maps.map_line_shp(net)
netc = maps.base_choropleth_unique(netm, values)

# Color the points with a classification scheme
ptsm = maps.map_point_shp(pts)
ptsm = maps.base_choropleth_classif(ptsm, values)
ptsm.set_alpha(0.5)
ptsm.set_linewidth(0.)

# Overlay both layers using the network bounding box
ax = maps.setup_ax([netc, ptsm], [net.bbox, net.bbox])
fig.add_axes(ax)
show()



In [22]:
maps.plot_poly_lines(ps.examples.get_path('columbus.shp'))


callng plt.show()

Higher-level component

This currently includes the following end-user functions:

  • plot_poly_lines: very quick shapefile plotting
  • plot_choropleth: quick choropleth maps using a chosen classification type

In [23]:
shp_link = ps.examples.get_path('columbus.shp')
values = np.array(ps.open(ps.examples.get_path('columbus.dbf')).by_col('HOVAL'))

types = ['classless', 'unique_values', 'quantiles', 'equal_interval', 'fisher_jenks']
for typ in types:
    maps.plot_choropleth(shp_link, values, typ, title=typ)


Folium

Contributors:

In addition to using matplotlib, the viz module includes components that interface with the folium library which provides a Pythonic way to generate Leaflet maps.


In [24]:
import pysal as ps
import geojson as gj
from pysal.contrib.viz import folium_mapping as fm

First, we need to convert the data into a JSON format. JSON, short for "JavaScript Object Notation," is a simple and effective way to represent objects in a digital environment. For geographic information, the GeoJSON standard defines how to represent geographic information in JSON format. Python programmers may be more comfortable thinking of JSON data as something akin to a standard Python dictionary.
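To make the dictionary analogy concrete, here is a minimal sketch that uses only the standard library `json` module. It builds a tiny GeoJSON FeatureCollection by hand and round-trips it through a JSON string. The geometry and the property names (`NAME`, `VALUE`) are made up for illustration and do not come from the example data.

```python
import json

# A single GeoJSON Feature: a geometry plus a dictionary of attributes.
# The polygon ring is closed (first and last positions are identical).
feature = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]],
    },
    "properties": {"NAME": "Example", "VALUE": 1.5},
}
collection = {"type": "FeatureCollection", "features": [feature]}

# Serialize to a JSON string and parse it back into Python objects.
text = json.dumps(collection)
restored = json.loads(text)
first_name = restored["features"][0]["properties"]["NAME"]
print(first_name)
```

Attribute columns end up under each feature's `properties` key, which is where a choropleth function would look up the variable to map.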


In [25]:
filepath = ps.examples.get_path('south.shp')[:-4]
shp = ps.open(filepath + '.shp')
dbf = ps.open(filepath + '.dbf')

In [26]:
js = fm.build_features(shp, dbf)

This constructs a dictionary-like object with the following keys:


In [27]:
js.keys()


Out[27]:
[u'type', 'bbox', 'features']

In [28]:
js.type


Out[28]:
'FeatureCollection'

In [29]:
js.bbox


Out[29]:
[-106.6495132446289, 24.95596694946289, -75.0459976196289, 40.63713836669922]

In [30]:
js.features[0]


Out[30]:
{"bbox": [-80.6688232421875, 40.39815902709961, -80.52220916748047, 40.63713836669922], "geometry": {"coordinates": [[[-80.6280517578125, 40.39815902709961], [-80.60203552246094, 40.480472564697266], [-80.62545776367188, 40.504398345947266], [-80.6336441040039, 40.53913879394531], [-80.6688232421875, 40.568214416503906], [-80.66793060302734, 40.58207321166992], [-80.63754272460938, 40.61391830444336], [-80.61175537109375, 40.619998931884766], [-80.57462310791016, 40.615909576416016], [-80.52220916748047, 40.63713836669922], [-80.52456665039062, 40.47871780395508], [-80.52377319335938, 40.4029655456543], [-80.6280517578125, 40.39815902709961]]], "type": "Polygon"}, "properties": {"BLK60": 3.839454752, "BLK70": 3.2554278095, "BLK80": 2.5607402642, "BLK90": 2.5572616581, "CNTY_FIPS": "029", "COFIPS": 29, "DNL60": 6.1681225056, "DNL70": 6.1714993547, "DNL80": 6.1714631077, "DNL90": 6.0508978146, "DV60": 2.2779893943, "DV70": 2.5591397849, "DV80": 5.0619350519, "DV90": 7.2636377003, "FH60": 9.9812973718, "FH70": 7.8, "FH80": 9.7857968181, "FH90": 12.604551644, "FIPS": "54029", "FIPSNO": 54029, "FP59": 9.6, "FP69": 5.9, "FP79": 6.5327526442, "FP89": 10.17311807, "GI59": 0.2236450331, "GI69": 0.2953773833, "GI79": 0.3322512119, "GI89": 0.3639335641, "HC60": 0.6666666667, "HC70": 1.6666666667, "HC80": 2.6666666667, "HC90": 0.3333333333, "HR60": 1.6828642349, "HR70": 4.1929776011, "HR80": 6.5977204876, "HR90": 0.9460827444, "MA60": 28.9, "MA70": 30.0, "MA80": 31.4, "MA90": 37.7, "MFIL59": 8.8410143105, "MFIL69": 9.2471543451, "MFIL79": 10.073356901, "MFIL89": 10.327970666, "NAME": "Hancock", "PO60": 39615, "PO70": 39749, "PO80": 40418, "PO90": 35233, "POL60": 10.586963113, "POL70": 10.590339963, "POL80": 10.607030509, "POL90": 10.469738422, "PS60": 1.218684208, "PS70": 1.1368342185, "PS80": 1.0385705291, "PS90": 0.8964534429, "RD60": -1.394676863, "RD70": -1.307438562, "RD80": -1.159302086, "RD90": -0.399028376, "SOUTH": 1, "STATE_FIPS": "54", "STATE_NAME": "West Virginia", 
"STFIPS": 54, "UE60": 3.1, "UE70": 2.7, "UE80": 7.0763827919, "UE90": 6.8578070515}, "type": "Feature"}

Then, we write the JSON to a file:


In [31]:
with open('./example.json', 'w') as out:
    gj.dump(js, out)

Mapping

Let's look at the columns that we are going to map.


In [32]:
list(js.features[0].properties.keys())[:5]


Out[32]:
[u'HR90', u'PS90', u'FH80', u'HC60', u'FP79']

We can map these attributes by calling them as arguments to the choropleth mapping function:


In [15]:
import folium as fol

In [16]:
fm.choropleth_map('./example.json', 'FIPS', 'HR90')


/home/serge/anaconda2/lib/python2.7/site-packages/folium/folium.py:504: UserWarning: This method is deprecated. Please use Map.choropleth instead.
  warnings.warn('This method is deprecated. '
Out[16]:

This produces a map using default classifications and color schemes and saves it to an HTML file. The function has sane defaults, but many options are available for users who want more control.

There are arguments to change the classification scheme:


In [17]:
fm.choropleth_map('./example.json', 'FIPS', 'HR90', classification = 'Quantiles')


Out[17]:

Most PySAL classifiers are supported.
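For readers unfamiliar with what a classifier such as 'Quantiles' does, the sketch below computes class breaks so that each class holds roughly the same number of observations. It uses a simple nearest-rank rule in plain Python, which is an assumption made for illustration; PySAL's own classifier may compute its break points differently. The names `quantile_breaks` and `classify` are hypothetical.

```python
import math


def quantile_breaks(values, k):
    """Upper bounds of k classes with roughly equal counts (nearest-rank rule)."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[math.ceil(n * (i + 1) / k) - 1] for i in range(k)]


def classify(value, breaks):
    """Index of the first class whose upper bound is >= value."""
    for idx, upper in enumerate(breaks):
        if value <= upper:
            return idx
    # Values above the last break fall into the top class.
    return len(breaks) - 1


rates = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
breaks = quantile_breaks(rates, 4)
classes = [classify(r, breaks) for r in rates]
print(breaks, classes)
```

Each class index would then be matched to one color of the chosen scheme.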

Base Map Type


In [18]:
fm.choropleth_map('./example.json', 'FIPS', 'HR90', classification = 'Jenks Caspall', tiles='Stamen Toner', save=True)


Out[18]:

We support the entire range of built-in basemap types in Folium, but custom tilesets from Mapbox are not supported (yet).

Color Scheme


In [19]:
fm.choropleth_map('./example.json', 'FIPS', 'HR80', classification = 'Jenks Caspall', tiles='Stamen Toner', fill_color = 'PuBuGn', save=True)


Out[19]:

All color schemes are ColorBrewer palettes, and the scheme name is simply passed through to Folium on execution.

Class numbers


In [20]:
fm.choropleth_map('./example.json', 'FIPS', 'HR80', classification = 'Equal Interval', classes=6, tiles='Stamen Toner', fill_color='PuBuGn',save=True)


Out[20]:

Folium supports up to 6 classes.
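The effect of the classes argument with an 'Equal Interval' scheme can be sketched as follows: the range of the data is divided into k bins of equal width. This is a plain Python illustration under that assumption, not the PySAL classifier itself, and `equal_interval_breaks` is a hypothetical name.

```python
def equal_interval_breaks(values, k):
    """Upper bounds of k equal-width classes spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * (i + 1) for i in range(k)]


# Six classes over a range of 0 to 12 gives bins that are 2 units wide.
breaks = equal_interval_breaks([0.0, 3.0, 12.0], 6)
print(breaks)
```

Unlike quantiles, equal-interval classes can hold very different numbers of observations when the data are skewed.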

Cartopy

Although we don't have time to go into the details here today, we note that for publication-ready maps one can turn to Cartopy. For an example of a recent publication and code, see Rey (2016).

